Introduction to RStudio

Michael Clark

June 13, 2016

Overview

Base R is an extremely powerful tool. However…

The syntax editor in Base R is a simple text editor and nothing more.

You should use something more efficient and easy to use.

Options:

  • RStudio
  • Emacs
  • Vim
  • Various others

Note, however, that there is an Emacs (and Vim) mode within RStudio.

Overview

RStudio offers:

  • Code completion and snippets
  • Code diagnostics
  • Customizable shortcuts
  • Document generation (web, pdf, presentation, .doc)
  • Web publishing

Overview

RStudio offers:

  • Enhanced debugging, profiling
  • Navigable data frames
  • Version control
  • Interactive visualization
  • Addins

Overview

RStudio is also an excellent tool for reproducible research.

  • Project management
  • Package building and freezing
  • Document generation

In short, you can go from data import to (modern, web-based) publication and replication.

Scripting

Most people who use RStudio do so for easier scripting, which includes:

  • Syntax highlighting
  • Autocomplete of object/function names, etc.
  • Autopairing of parentheses, quotes, etc.
  • Auto indent

It even makes working at the console viable.

  • Still not advised

Keyboard shortcuts: standard scripts

Knowing a few shortcuts can save a lot of time in the long run.

Examples (Windows and Linux):

Run current line: Ctrl+Enter

Run up to/from current line: Ctrl+Alt+B/Ctrl+Alt+E

Run everything: Ctrl+Shift+Enter

Insert section: Ctrl+Shift+R

Show keyboard shortcut reference: Alt+Shift+K

Keyboard shortcuts: standard scripts

Copy lines up/down: Shift+Alt+Up/Down

Move lines up/down: Alt+Up/Down

Yank up to cursor: Ctrl+U

Yank after cursor: Ctrl+K

Select multiple lines: Ctrl+Alt+Click

Select window: Ctrl+1:9

Expand window: Ctrl+Shift+1:9

Keyboard shortcuts: tab selection

Previous tab: Ctrl+F11

First tab: Ctrl+Shift+F11

Next tab: Ctrl+F12

Last tab: Ctrl+Shift+F12

Keyboard shortcuts: documents

Insert chunk: Ctrl+Alt+I

Run chunk: Ctrl+Alt+C

Run previous chunks: Ctrl+Alt+P

Knit document: Ctrl+Shift+K

Some favorite shortcuts

Multicursor: Ctrl+Alt+select

Move lines: (shift+) alt + arrow

Clear console: Ctrl + L

Restart R: Ctrl + Shift + F10

Scripting/Console Window: Ctrl+1/Ctrl+2

Rerun previous: Ctrl + Shift + P

Run everything before: Ctrl + Alt + B

Run everything after: Ctrl + Alt + E

Knit: Ctrl + Shift + K

Keyboard shortcuts

The point is, knowing just a dozen shortcuts could save a lot of time.

  • Bonus: Ctrl+Shift+A (tidy up your code)

Mac users: most of these would use Cmd rather than Ctrl (but not always).

Snippets

Snippets allow one to insert code of a certain form for commonly used functions.

You only have to type the first couple of letters and press Tab; the skeleton of the code fills in, and you can then tab your way through the remaining pieces.
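As a rough illustration, assuming the built-in `fun` snippet (its exact template may differ by RStudio version): typing `fun` and pressing Tab expands to a function skeleton with placeholders for the name, arguments, and body. Filled in with made-up names, the result could look like this:

```r
# Result of expanding the `fun` snippet and tabbing through its placeholders.
# The function name and body below are hypothetical, chosen for illustration.
rescale <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

# Quick check that the filled-in function behaves as intended
rescale(c(2, 4, 6))
```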

Code Diagnostics

RStudio will note problems in your code in the margin.

  • Examples: hanging brackets, too many commas etc.

This works beyond just R scripts too!

Customization

RStudio allows one to customize various aspects of how it looks and how you interact with it.

In addition, these customizations can apply to all RStudio sessions or only to a particular project you’re working on.

  • Tools/Global Options
  • Tools/Project Options

Customization

As a starting point, you may want to change how it looks.

  • Code look, indentation, highlighting etc.
  • Window pane locations

Customization

You may well want to change things such as the look, or whether the workspace is saved automatically, but the main point is simply to be aware of what you can change.

Projects

Projects provide a self-contained ecosystem within which to work.

  • Projects have their own working directory, workspace, etc.

If you have multiple projects, you can easily jump between them.

  • Without losing your place

Projects

File/New Project

Usually you’ll select New Directory, but you’ll want to note the other options.

  • We’ll talk about version control later.

Projects

All tabs opened will remain open when you revisit the project.

You can have multiple projects running at the same time

  • i.e. multiple RStudio instances

Projects can be seen as a first step in getting more organized, more reproducible etc.

Rmarkdown

What is R?

R is fast becoming a general programming environment rather than just a statistical one.

Markdown is a language that allows for easier web-based documentation.

  • Not necessary to know html

Now one can intermingle R with markdown, html, css, javascript, \(\LaTeX\) and others, resulting in a variety of products.

Rmarkdown

  • html, pdf, doc
  • presentations (like this one)
  • dashboards
  • notebooks
  • websites
  • other publications

Example

File/New/R Markdown…

Or many of the others too.

R ‘chunks’ are interspersed throughout the Rmd file, combining code, plain text, markdown and possibly others.

An Rmarkdown workshop will be given in the future for more details.

Interactive and Visual Data Exploration

The Viewer

In addition to the “Plots” pane, RStudio also provides a “Viewer” pane.

Anything interactive will be displayed there.

Packages

ggplot2 is the most widely used package for visualization in R.

However, it is not interactive by default.

Many packages use htmlwidgets and d3 (javascript library) to provide interactive graphics.

Some packages to note:

  • plotly
    • also available for Python, Matlab, and Julia; besides offering many interactive plot types, it can convert ggplot2 graphics to interactive ones
  • ggvis
    • interactive successor to ggplot2, though not currently under active development
  • rbokeh
    • like plotly, also has cross-program support
  • DT
    • interactive data tables

Interactive Tables (View)

RStudio now gives you a sortable table with the View function.

  • Use View just as you otherwise would

Example

Works in your presentations too.

Shiny

Shiny is a framework that can essentially allow you to build an interactive website.

Most of the more recently developed packages will work specifically within the shiny and rmarkdown settings.

R has a long history of providing interactive graphics, but most of it was very poor.

Interactive Tables With ‘DT’

The DT package allows you to create interactive data tables.

These tables will open in your viewer:

  • They also work nicely within R Markdown and similar formats

    • They don’t, however, work in ioslides!

Checking Your Data

It is always a good idea to give your data a visual pass.

DT::datatable(state.x77)

Identify anomalies, nonsense values, etc.

Other Things To Check

Summaries

Correlation Tables
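A minimal sketch of both checks, using the same built-in `state.x77` data and base R only (the object name `state_cors` is made up here):

```r
# Numeric summaries (min, quartiles, mean, max) for every column
summary(state.x77)

# Pairwise correlations, rounded for easier reading
state_cors <- round(cor(state.x77), 2)
state_cors
```

The correlation matrix can also be passed to DT::datatable() if you want it sortable.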

Some Additional Functionality

The datatable function also has an argument to filter your data.

Same as above, but with division as a string:
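A minimal sketch of a filtered table, assuming "division" refers to the built-in `state.division` factor and that the DT package is installed. The object names `states` and `state_table` are made up for illustration.

```r
# Combine the numeric matrix with the census division stored as plain text.
# state.x77 and state.division both ship with R (datasets package).
states <- data.frame(
  division = as.character(state.division),
  state.x77,
  stringsAsFactors = FALSE
)

# filter = "top" places a filter box above each column
if (requireNamespace("DT", quietly = TRUE)) {
  state_table <- DT::datatable(states, filter = "top")
} else {
  state_table <- NULL  # DT is not installed; nothing to build
}

# Typing `state_table` at the console opens the table in the Viewer pane
```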

Interactive Plots

RStudio, in conjunction with some modern packages, lets you visually explore your data.

There are two packages that are great for this:

  • ggvis

  • rbokeh

ggvis

With ggvis, you can take some of what you know from ggplot2 and use it to create interactive, web-ready plots.

  • Instead of building layers with +, ggvis imports the %>% operator from magrittr.
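A minimal sketch of the piping style, assuming ggvis is installed and using the built-in `mtcars` data (the object name `scatter` is made up here):

```r
if (requireNamespace("ggvis", quietly = TRUE)) {
  library(ggvis)  # attaches ggvis along with its re-exported %>% operator
  # Map weight to x and mpg to y, then add a points layer with %>% instead of +
  scatter <- mtcars %>%
    ggvis(~wt, ~mpg) %>%
    layer_points()
} else {
  scatter <- NULL  # ggvis is not installed; nothing to draw
}

# Typing `scatter` at the console displays the plot in the Viewer pane
```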

ggvis

rbokeh

The rbokeh package is a port of the ‘bokeh’ package from Python.

It is very similar to ggvis, but there are some distinct differences.

An Example
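A minimal sketch of a comparable scatterplot, assuming rbokeh is installed and using the built-in `mtcars` data. The object name `bokeh_plot` is made up, and the calls are written without the pipe to keep the example self-contained.

```r
if (requireNamespace("rbokeh", quietly = TRUE)) {
  # Start an empty figure, then add a points layer for weight vs. mpg
  bokeh_plot <- rbokeh::ly_points(rbokeh::figure(), wt, mpg, data = mtcars)
} else {
  bokeh_plot <- NULL  # rbokeh is not installed; nothing to draw
}

# Typing `bokeh_plot` at the console displays the plot in the Viewer pane
```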

Which One?

It might be easier to get up and going with rbokeh.

On the other hand, ggvis is incredibly powerful and is ready to go into any shiny app.

Quick Wrap

RStudio lets you take a deeper look at your data.

Interactive tables and plots go a long way to helping you understand your data better.

Addins

RStudio allows its users to create functions that can be used within RStudio with a click or keystroke.

These special functions are called addins.

Addins are a great way to increase your productivity and efficiency when scripting.

They can be anything, but the easiest (and perhaps most useful) example is text insertion/formatting.

Creating Addins

Addins are nothing more than R functions that you can call interactively.

insertMatrix()

[[Question:Matrix]]

[[Choices]]

[[Answers]]
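A minimal sketch of what insertMatrix() could look like, assuming it simply inserts the three tags above at the cursor. The function body and the name `matrix_template` are guesses for illustration; rstudioapi::insertText() only works when called from within RStudio.

```r
# Text to insert: the skeleton of a matrix question
matrix_template <- paste(
  "[[Question:Matrix]]", "",
  "[[Choices]]", "",
  "[[Answers]]", "",
  sep = "\n"
)

# Hypothetical addin body: insert the template at the current cursor position
insertMatrix <- function() {
  rstudioapi::insertText(text = matrix_template)
}
```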

Steps

To make addins available within your RStudio, you need to do the following:

  • Create an R package

  • Create R functions for the addins

  • Create a Debian control file (.dcf) at a specific location

Debian Control File?

We do not need to worry about what exactly is happening with these.

Just create the following folders and file:

  • inst/rstudio/addins.dcf (relative to the package root)

Register the Addins

To include the addins in the .dcf, you need to include the following information:

  • Name: What should appear as the addin name

  • Description: A brief description

  • Binding: The function name

  • Interactive: Logical for interactivity

An Example

Name: Insert [[Question:Matrix]]
Description: Inserts a matrix question into the advanced format text file.
Binding: insertMatrix
Interactive: false

  • You can include any number of addins within the file, just leave a blank line between each entry.
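If you would rather generate the file from R, base R's write.dcf() and read.dcf() understand this format. A minimal sketch, in which the second addin (`insertTextEntry`) is hypothetical and the file is written to a temporary directory rather than a real package:

```r
# Two addin entries; the second one is made up for illustration only
addins <- data.frame(
  Name = c("Insert [[Question:Matrix]]", "Insert Text Entry"),
  Description = c("Inserts a matrix question.", "Inserts a text entry question."),
  Binding = c("insertMatrix", "insertTextEntry"),
  Interactive = c("false", "false"),
  stringsAsFactors = FALSE
)

# In a real package this file would live at inst/rstudio/addins.dcf;
# a temporary directory is used here so the sketch can run anywhere
dcf_path <- file.path(tempdir(), "addins.dcf")
write.dcf(addins, file = dcf_path)

# Read the file back to confirm each entry parses as a separate record
registered <- read.dcf(dcf_path)
registered[, "Binding"]
```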

Interactive

We can use shiny to create interactive addins.

  • Just set interactive to true.

The addinexamples package has a few good examples of interactive addins.

R Package

Addins have to be put inside of an R package to function.

  • It could be a package with nothing but addins if you want.

Once the addins are included, they will be there until you decide to remove them.

  • You do not need to do anything special to load the addins!

Using ‘rstudioapi’

The rstudioapi package offers some functions to create addins.

The insertText function is one that you will probably use the most.

There are a few others that you might find useful:

  • modifyRange - does the same thing as insertText, but over ranges.

  • navigateToFile - opens a file within RStudio (you can even specify the line)

  • askForPassword - requires the user to input a password

Shortcuts

Being able to interactively call functions is handy.

To make the most out of the addins, you can assign each one a keyboard shortcut.

  • Addins > Browse Addins > Keyboard Shortcuts…

You can set them to be whatever you want!

“Gotchas” and Tips

Always make sure that the binding and the function name are the same.

  • You will get an error if there is not a match.

Rebuild the package if you add more functions.

It never hurts to restart your R session if you have multiple projects open.

Quick Wrap

RStudio addins can save you a lot of time on tasks that you frequently do.

Like most things, a little initial work will yield massive time savings in the future.

More Advanced

Debugging

Profiling

What is Debugging?

Debugging is merely finding and fixing problematic code.

  • Code will always have bugs

Debugging is an absolutely essential part of creating functions.

A note about functions

If you are doing anything more than twice, you should write a function instead.

  • It’s more generalizable
  • It’s more reproducible
  • Debugging can allow one to spot issues
  • RStudio can help you get started transforming existing code to a function

    • Ctrl+Alt+X on highlighted code
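A small sketch of the idea, using the built-in `mtcars` data. The function name `standardize` and the choice of columns are made up for illustration.

```r
# Repeated by hand: the same calculation typed out once per column
# mtcars$mpg_z <- (mtcars$mpg - mean(mtcars$mpg)) / sd(mtcars$mpg)
# mtcars$wt_z  <- (mtcars$wt  - mean(mtcars$wt))  / sd(mtcars$wt)
# mtcars$hp_z  <- (mtcars$hp  - mean(mtcars$hp))  / sd(mtcars$hp)

# Written once as a function
standardize <- function(x) {
  (x - mean(x)) / sd(x)
}

# Applied to as many columns as needed
z_scores <- sapply(mtcars[c("mpg", "wt", "hp")], standardize)
head(z_scores)
```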

Debugging in RStudio

There are numerous facilities within R to help you debug your code.

  • Break Points

  • browser()

  • debugonce()

RStudio makes the process easier.

Overview

Break Points

  • Used to enter into debugging at a certain line

browser()

  • Used to stop a function at a certain point and enter into debugging.

debug() / debugonce()

  • Enter debugging when the function is next called (debug() on every call, debugonce() only once)

Debug Mode Commands

There are commands that allow you to work through debugging:

  • Next (n): runs the next line

  • Step into (s): if the next line calls a function, it enters into that function

    • Careful with this one; you can get pretty far into other functions

  • Finish (f): finishes the current loop or function

  • Continue (c): leaves the debugger and runs the rest of the function

  • Stop (Q): stops debugging and does not run the rest of the function

Each of these also has a button in the debugging menu
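A minimal sketch of browser() and debugonce() in use. The function `scale_to_unit` is made up for illustration, and the browser() call is guarded so that it only pauses in an interactive session.

```r
# Rescale a numeric vector to the 0-1 range
scale_to_unit <- function(x) {
  rng <- range(x)
  # Pause here if every value is identical (division by zero is coming);
  # skipped when the script is run non-interactively
  if (interactive() && isTRUE(diff(rng) == 0)) browser()
  (x - rng[1]) / diff(rng)
}

scale_to_unit(c(10, 15, 20))

# To step through the whole function on its next call only:
# debugonce(scale_to_unit)
# scale_to_unit(c(3, 3, 3))   # then use n, s, f, c, or Q at the Browse prompt
```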

Profiling

Quick Wrap

  • Debugging is an important part of programming.
  • RStudio makes the debugging more interactive and flexible than R alone.

Version Control

Overview

RStudio offers the ability to integrate version control into your project.

  • Subversion
  • Git
  • Both are free and open

Wait, Wait! What Is Version Control?

At its most basic, it is just a way to manage changes.

  • Documents, code, etc.

Especially useful when collaborating.

  • Keep track of who is making changes and what they are changing
  • Revert changes back to an earlier version
  • Merging multiple copies of a document into one

Subversion

Subversion (SVN) is a “client-server” program.

  • Users share a single repository

Git

Git works on a distributed model.

  • Users create their own local repositories

Created by Linus Torvalds for Linux kernel development

The Baskin Robbins of version control

  • Bitbucket
  • GitLab
  • GitHub

GitHub

GitHub is a web-based service that hosts your Git repositories.

Process (Briefly)

  • Commit
  • Push into repository
  • Pull from repository

The Shell

RStudio has you covered for most commands.

If you need other things, RStudio will pull up the shell for you:

  • commit --amend - Redo the most recent commit
  • branch - Create a new branch
  • merge - Merge another branch into the current one (e.g., a feature branch into master)
  • log (--oneline) - Show all commits (one line each)
  • tag - Label a specific commit as an important one
  • rm - Remove a file
  • stash - Set aside uncommitted changes (e.g., to pull without conflicts)
  • blame - Find out who made changes to a file

A Common Shell Use

Sometimes, we create our local repository before creating our remote repository in GitHub.

  • Excitement often gets the best of us!

If you want to push your local repository to a new remote repository, just use the following:
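A minimal sketch of one common sequence (`git remote add origin <url>`, then `git push -u origin master`), wrapped in system2() so it can be run from R. The URL is a placeholder, the branch name `master` is an assumption, and the names `git_steps` and `run_git` are made up; in the shell you would type the two git commands directly.

```r
# Placeholder URL: replace with the address of your own empty GitHub repository
remote_url <- "https://github.com/USER/REPO.git"

# The two git commands, stored as argument vectors for system2()
git_steps <- list(
  c("remote", "add", "origin", remote_url),  # point the local repo at the remote
  c("push", "-u", "origin", "master")        # upload and set the upstream branch
)

# Flip to TRUE only inside a project that is already a Git repository
run_git <- FALSE
if (run_git) {
  for (args in git_steps) system2("git", args)
}
```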

Lightning Demo

Various Shenanigans

Quick Wrap

RStudio makes it easy to integrate version control into your project.

You have nothing to lose by keeping track of files and the changes that have been made to them.

This is especially useful when collaborating.

Package Development

RStudio makes package development easy.

  • New Project > New Directory > R Package

R Package Dialog Box

“Create package based on source files:” allows you to include previously written functions in your new package.

When the package gets created, each of the functions you added at this step will get its own help file.

  • You will still need to complete the help files, but at least they are there.

  • Do keep in mind that you need a \(\LaTeX\) installation to build the PDF versions of the help files!

What You Get

RStudio will automatically start you out with the following:

  • DESCRIPTION: Just like every R package

  • A ‘man’ folder: Contains .Rd files for each function

  • An ‘R’ folder: Contains your functions.

You might also consider adding a README.md file if you want to put your package on GitHub.

Function Documentation

The roxygen2 package helps to properly format your documentation files by generating .Rd files from specially formatted comments in your R code.

The skeleton .Rd files created along with the package already contain your arguments:

\arguments{
  \item{x}{
%%     ~~Describe \code{x} here~~
}
  \item{y}{
%%     ~~Describe \code{y} here~~
}
}

All you need to do is add explanatory text and working examples.
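A minimal sketch of roxygen2-style comments for a hypothetical function `add_two` with arguments `x` and `y`. Running devtools::document() on the package would turn these comments into the matching .Rd file.

```r
#' Add two numbers
#'
#' @param x A numeric vector.
#' @param y A numeric vector the same length as `x`.
#' @return The element-wise sum of `x` and `y`.
#' @examples
#' add_two(1, 2)
#' @export
add_two <- function(x, y) {
  x + y
}
```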

Build & Reload

After you have all of your files ready, you can build the package.

Check

Packages tend to have a lot happening in them.

To help you make sure that the package has everything it needs, you can run the check function from devtools on it.

It will check package quality across many dimensions:

  • Ability to install package and its dependencies

  • Checking help file quality

  • Find errors in examples

All of this testing will occur within the “Build” pane and you can see errors as they occur.

You can also look at the log file that is produced.

Quick Wrap

RStudio has built-in tools that make package creation a straightforward process.

You should not be afraid to create your own packages.

Cheat Sheets

Cheat Sheets - RStudio Style

RStudio wants everything to be easy for us as R users.

  • They do not mind that the kids (or rapscallions as Ripley would likely call them) are playing on the lawn.

As such, they have produced a series of cheat sheets as reference material.

https://www.rstudio.com/resources/cheatsheets/

RStudio

RStudio has a cheatsheet for using RStudio!

It provides a high-level overview for many of the things we are talking about here.

It also has a comprehensive list of keyboard shortcuts.

  • Alt + Shift + K will bring them up in RStudio.
  • Shortcuts can save you a lot of time.
  • Do show care in your keystrokes…otherwise, you might find your screen rotated or your keyboard is producing Hebrew characters.

Data Visualization

It is essentially a primer on using ggplot2.

It effectively communicates the various geoms.

For the beginning ggplot2 user, the following sections are indispensable:

  • Scales
  • Coordinate Systems
  • Faceting
  • Position Adjustments

Data Wrangling

Data wrangling is essentially just a fun way of saying data cleaning and prep.

The cheat sheet offers some useful tips on using two handy packages:

  • dplyr
    • Handles all manner of data subsetting, filtering, variable selection, grouping, summarizing, etc.
  • tidyr
    • Used for reshaping data (wide to long, long to wide).
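A minimal sketch of the dplyr side only, assuming dplyr is installed and using the built-in `mtcars` data (the object name `mpg_by_cyl` is made up here):

```r
if (requireNamespace("dplyr", quietly = TRUE)) {
  library(dplyr)
  # Keep cars above 100 horsepower, then summarize mpg within each cylinder group
  mpg_by_cyl <- mtcars %>%
    filter(hp > 100) %>%
    group_by(cyl) %>%
    summarise(mean_mpg = mean(mpg), n_cars = n())
} else {
  mpg_by_cyl <- NULL  # dplyr is not installed; nothing to compute
}

mpg_by_cyl
```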

R Markdown

R Markdown is used to generate reproducible documents with R.

Your document can contain code, data, analyses, visualizations, or anything else that you want to include.

You may also include html, css, javascript, and \(\LaTeX\) in your documents.

R Markdown documents can be saved as html, pdf, or even Word documents.

R Markdown Reference Guide

R Markdown is really a combination of three different things:

  • markdown
    • The basic structure of the document (headings, sections, text)
  • knitr
    • Controls how R is used within the document
  • pandoc
    • Controls the output (html, pdf; document, presentation)

The last page lists each format and the options that are available.

Package Development

RStudio makes package development accessible to anyone.

It has many capacities for helping you to create packages:

  • automatic file creation with roxygen2

The cheat sheet details using devtools.

  • devtools was created specifically for package development

Shiny

Shiny is a framework for building web pages that allow users to interact with an R session.

  • Users can interact with the data, models, visualizations, etc.

Quick Wrap

RStudio wants to make things easy on you!

Having a handy copy of the cheat sheets will serve you well!